Abstract: A new roll-forward technique is proposed that recovers
from any single fail-stop failure in M integer data streams (M ≥ 3)
undergoing linear, sesquilinear or bijective (LSB) operations, such
as scaling, additions/subtractions, inner or outer vector products
and permutations. In the proposed approach, the M input integer data
streams are linearly superimposed to form M numerically entangled
integer data streams that are stored in place of the original inputs.
A series of LSB operations can then be performed directly on these
entangled data streams. The output results can be extracted from any
M - 1 entangled output streams by additions and arithmetic shifts,
thereby guaranteeing robustness to a fail-stop failure in any single
stream computation. Importantly, unlike other methods, the number of
operations required for the entanglement, extraction and recovery of
the results is linearly related to the number of inputs and does not
depend on the complexity of the performed LSB operations. We have
validated our proposal on an Intel processor (Haswell architecture
with AVX2 support) via convolution operations. Our analysis and
experiments reveal that the proposed approach incurs only a 1.8% to
2.8% reduction in processing throughput in comparison to the
failure-intolerant approach. This overhead is 9 to 14 times smaller
than that of the equivalent checksum-based method. Thus, our proposal
can be used in distributed systems and unreliable processor hardware,
or safety-critical applications, where robustness against fail-stop
failures becomes a necessity.

Index Terms: linear operations, sum-of-products, algorithm-based
fault tolerance, fail-stop failure, numerical entanglement


I. Introduction 

The increase of integration density [1] and aggressive
voltage/frequency scaling in processor and custom-hardware designs
[2], along with the ever-increasing tendency to use commercial
off-the-shelf processors to create vast computing clusters, have
decreased the mean-time-to-failure of modern computing systems.
Therefore, it is now becoming imperative for distributed computing
systems to provide for fail-stop failure mitigation [3], i.e., to
recover from cases where one of their processor cores becomes
unresponsive or does not return the results within a predetermined
deadline. Applications that are particularly prone to fail-stop
failures include distributed systems like grid computing [4], sensor
networks [5], webpage or multimedia retrieval and object or face
recognition in images [6], financial computing [7], etc. The
compute- and memory-intensive parts of these applications comprise
linear, sesquilinear (also known as “one-and-half linear”) and
bijective operations, collectively called LSB operations in this
paper. These operations are typically performed using single or
double-precision floating-point inputs or, for systems requiring
exact reproducibility and/or reduced hardware complexity, 32-bit or
64-bit integer or fixed-point inputs. Thus, ensuring robust recovery
from fail-stop failures for applications comprising integer LSB
operations is of paramount importance.

A. Summary of Prior Work 

Existing techniques that can ensure recovery from fail-stop failures
fall into two categories: (i) roll-back via checkpointing and
recomputation [8], [9], [10], i.e., methods that periodically save
the state of all running processes, such that the execution can be
rolled back to a “safe state” in case of failures; (ii) roll-forward
methods producing additional “checksum” inputs/outputs such that the
missing results from a core failure can be recovered from the
remaining cores without recomputation. Examples of roll-forward
methods include algorithm-based fault-tolerance (ABFT) and modular
redundancy (MR) methods. Although no recomputation is required in
roll-forward methods (thereby ensuring quick recovery from a failure
occurrence), checksum-based methods can incur significant
computational and energy-consumption overhead because of the
additional checksum-generation and redundant computations required
[11].

B. Contribution 

We propose a new roll-forward failure-mitigation method for linear,
sesquilinear (also known as one-and-half linear) or bijective
operations performed on integer data streams. Examples of such
operations are element-by-element additions and multiplications,
inner and outer vector products, sum-of-squares and permutation
operations. They are the building blocks of algorithms of
foundational importance, such as matrix multiplication,
convolution/cross-correlation, template matching for search
algorithms, covariance calculations, integer-to-integer transforms
and permutation-based encoding systems [22], which form the core of
the applications discussed earlier. Because our method performs
linear superpositions of input streams onto each other, it
“entangles” input streams together and we term it numerical
entanglement. Our approach guarantees recovery from any single
stream-processing failure without requiring recomputation.
Importantly, numerical entanglement does not generate additional
“checksum” or duplicate streams and does not depend on the specifics
of the LSB operation performed. It is therefore found to be extremely
efficient in comparison to checksum-based methods that incur overhead
proportional to the complexity of the operation performed.

C. Paper Organization 

In Section II, we introduce checksum-based methods and MR for
fail-stop failure recovery in numerical stream processing. In
Section III, we introduce the notion of numerical entanglement and
demonstrate its inherent reliability for LSB processing of integer
streams. Section IV presents the complexity of numerical
entanglement within integer linear or sesquilinear operations.
Section V presents experimental comparisons and Section VI presents
some concluding remarks.

II. Checksum/MR-based Methods versus 
Numerical Entanglement 

Consider a series of M input streams of integers, each comprising
$N_{\text{in}}$ samples^1 (M ≥ 3):

$$ \mathbf{c}_m = \left[ c_{m,0} \;\cdots\; c_{m,N_{\text{in}}-1} \right], \quad 0 \leq m < M. \tag{1} $$

These may be the elements of M rows of a matrix of integers, or a
set of M input integer streams of data to be operated upon with an
integer kernel $\mathbf{g}$. This operation is performed by:

$$ \forall m: \; \mathbf{d}_m = \mathbf{c}_m \;\text{op}\; \mathbf{g}, \qquad \text{op} \in \left\{ +, -, \times, \langle\cdot,\cdot\rangle, \otimes, \Pi_{\mathcal{I}\rightarrow\mathcal{G}}, \star \right\} \tag{2} $$

with $\mathbf{d}_m$ the mth vector of output results (containing
$N_{\text{out}}$ values) and op any LSB operator such as
element-by-element addition/subtraction/multiplication, inner/outer
product, permutation^2 (i.e., bijective mapping from the sequential
index set $\mathcal{I}$ to index set $\mathcal{G}$ corresponding to
$\mathbf{g}$) and circular convolution or cross-correlation with
$\mathbf{g}$. Beyond the single LSB operator indicated in (2), we
can also assume series of such operators applied consecutively in
order to realize higher-level algorithmic processing, e.g., multiple
consecutive additions, subtractions and scaling operations with
pre-established kernels followed by circular convolutions and
permutation operations. Conversely, the input data streams can also
be left in their native state (i.e., stored in memory), if
op = {×} and g = 1.

^1 Notations: Boldface uppercase and lowercase letters indicate
matrices and vectors, respectively; the corresponding italicized
lowercase letters indicate their individual elements, e.g.
$\mathbf{A}$ and $a_{m,n}$; $\hat{d}$ denotes the recovered value of
$d$ after disentanglement; all indices are integers. Operators:
superscript T denotes transposition; $\lfloor a \rfloor$ is the
largest integer that is smaller or equal to $a$ (floor operation);
$\lceil a \rceil$ is the smallest integer that is larger or equal to
$a$ (ceil operation); $a \ll b$ and $a \gg b$ indicate left and right
arithmetic shift of integer $a$ by $b$ bits with truncation occurring
at the most-significant or least-significant bit, respectively;
$a \bmod b = a - \lfloor a/b \rfloor b$ is the modulo operation.

^2 We remark that we consider LSB operations that are not
data-dependent, e.g., permutations according to fixed index sets as
in the Burrows-Wheeler transform [22].

A. Checksum-based Methods 

In their original (or “pure”) form, the input data streams of (1)
are uncorrelated and one input or output element cannot be used for
the recovery of another without inserting some form of coding or
redundancy. This is conventionally achieved via checksum-based
methods [10], [23]. Specifically, one additional input stream is
created, which comprises checksums of the original inputs:

$$ \mathbf{r} = \left[ r_0 \;\cdots\; r_{N_{\text{in}}-1} \right], \tag{3} $$

by using, for example, the sum of groups of M input samples at
position n in each stream, $0 \leq n < N_{\text{in}}$:

$$ \forall n: \; r_n = \sum_{m=0}^{M-1} c_{m,n}. \tag{4} $$

Then the processing is performed in all input streams
$\mathbf{c}_0, \ldots, \mathbf{c}_{M-1}$ and in the checksum input
stream $\mathbf{r}$ (each running on a different core) by:

$$ \begin{bmatrix} \mathbf{d}_0 \\ \vdots \\ \mathbf{d}_{M-1} \\ \mathbf{e} \end{bmatrix} = \begin{bmatrix} \mathbf{c}_0 \\ \vdots \\ \mathbf{c}_{M-1} \\ \mathbf{r} \end{bmatrix} \;\text{op}\; \mathbf{g}. \tag{5} $$

Any single fail-stop core failure in the group of M + 1 cores
executing (5) can be recovered from using the remaining M output
streams. As discussed in partitioning schemes for checksum-based
methods and ABFT [11], the recovery capability can be increased by
using additional weighted checksums.
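
To make the recovery step concrete, the following Python sketch instantiates (3)-(5) for circular convolution. It is an illustration under stated assumptions only: the helper names (`circular_convolve`, `checksum_stream`, `recover_missing`), the sample values and the choice of operator are hypothetical and not taken from the experiments of this paper, and each list entry stands in for the work of one core.

```python
from typing import List, Optional


def circular_convolve(stream: List[int], kernel: List[int]) -> List[int]:
    """Circular convolution of an integer stream with an integer kernel."""
    n = len(stream)
    return [sum(g * stream[(i - j) % n] for j, g in enumerate(kernel))
            for i in range(n)]


def checksum_stream(streams: List[List[int]]) -> List[int]:
    """Eq. (4): r_n is the sum of the M input samples at position n."""
    return [sum(samples) for samples in zip(*streams)]


def recover_missing(outputs: List[Optional[List[int]]],
                    checksum_output: List[int]) -> List[int]:
    """Rebuild the single missing output (marked None) from the others."""
    recovered = list(checksum_output)
    for out in outputs:
        if out is not None:
            recovered = [r - d for r, d in zip(recovered, out)]
    return recovered


# Hypothetical data: M = 3 streams of N_in = 4 samples and a 2-tap kernel.
c = [[3, -1, 4, 1], [5, 9, -2, 6], [-5, 3, 5, 8]]
g = [2, -1]

d = [circular_convolve(c_m, g) for c_m in c]   # cores 0 .. M-1
e = circular_convolve(checksum_stream(c), g)   # core M (checksum stream)

# Emulate a fail-stop failure of core 1 and recover its output.
surviving = [d[0], None, d[2]]
d1_recovered = recover_missing(surviving, e)
```

The sketch relies on the linearity of the operator: the processed checksum stream equals the sum of the processed inputs, so subtracting the surviving outputs leaves the missing one. Note that the checksum stream is an (M+1)th unit of work of the same complexity as every other stream.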

B. Proposed Numerical Entanglement 

Numerical entanglement mixes the inputs prior to processing using
linear superposition, and ensures the results can be recovered via a
mixture of shift-add operations. Specifically, considering M (M ≥ 3)
input streams $\mathbf{c}_m$, $0 \leq m < M$ (each comprising
$N_{\text{in}}$ integer samples), each element of the mth entangled
stream, denoted by $\epsilon_{m,n}$ ($0 \leq n < N_{\text{in}}$),
comprises the superposition of two input elements $c_{x,n}$ and
$c_{y,n}$ from different input streams x and y, i.e.,
$0 \leq x, y < M$ and $x \neq y$. The LSB operation op with kernel
$\mathbf{g}$ is carried out with M independent cores utilizing the
entangled input streams directly, thereby producing the entangled
output streams $\boldsymbol{\delta}_m$ (each comprising
$N_{\text{out}}$ integer samples). These can be disentangled to
recover the final results $\hat{\mathbf{d}}_m$. Any single fail-stop
failure in the M processor cores can be recovered from the results of
the remaining M - 1 cores utilizing additions and shift operations.

The complexity of entanglement, disentanglement (extraction) and
recovery does not depend on the complexity of the operator op, or on
the length of the kernel (operand) $\mathbf{g}$. The entangled inputs
can be written in-place and no additional storage or additional
operations are needed during the execution of the actual operation.
The entire process is also suitable for stream processors, with
entanglement applied as data within each input stream is being read.
Unlike checksum or MR methods, numerical entanglement does not use
additional processor cores, and the only detriment is that the
dynamic range of the entangled inputs is somewhat increased in
comparison to the original inputs $\mathbf{c}_m$. However, as will be
demonstrated in the next section, this increase depends on the number
of jointly-entangled inputs, M, i.e., the desired failure recovery
capability. Therefore, one can be traded for the other.
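
The reason an LSB operation can be applied directly to superimposed data is that it distributes over the superposition. The short Python sketch below checks this for three operators, assuming the superposition takes the shift-and-add form that Section III introduces (one stream left-shifted by l bits and added to another). The function names, the value l = 11 and the sample data are illustrative assumptions.

```python
from typing import List


def scale(stream: List[int], g: int) -> List[int]:
    """Element-by-element scaling by an integer g."""
    return [g * x for x in stream]


def inner_product(stream: List[int], g: List[int]) -> int:
    """Inner product of a stream with an integer kernel."""
    return sum(x * y for x, y in zip(stream, g))


def permute(stream: List[int], index_set: List[int]) -> List[int]:
    """Fixed (data-independent) permutation according to index_set."""
    return [stream[i] for i in index_set]


l = 11                                   # assumed shift of the superposition
c_x = [7, -3, 12, 0]
c_y = [-4, 9, 1, -6]
mixed = [(a << l) + b for a, b in zip(c_x, c_y)]   # one entangled stream

g_vec = [1, -2, 3, 4]
index_set = [2, 0, 3, 1]

# Operating on the mixture equals mixing the individually processed streams.
scaled_ok = scale(mixed, 5) == [
    (a << l) + b for a, b in zip(scale(c_x, 5), scale(c_y, 5))]
inner_ok = inner_product(mixed, g_vec) == (
    (inner_product(c_x, g_vec) << l) + inner_product(c_y, g_vec))
perm_ok = permute(mixed, index_set) == [
    (a << l) + b
    for a, b in zip(permute(c_x, index_set), permute(c_y, index_set))]
```

All three flags evaluate to true, provided no intermediate value overflows the integer representation; this is the dynamic-range trade-off mentioned above.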

III. Numerical Entanglement for Fail-Stop 
Reliability in LSB Operations 

We first illustrate our approach via its simplest instantiation,
i.e., entanglement of M = 3 inputs, and then present its general
application and discuss its properties.

A. Numerical Entanglement in Groups of M = 3 Inputs 

1) Entanglement: In the simplest form of entanglement (M = 3), each
triplet of input samples of the three integer streams, $c_{0,n}$,
$c_{1,n}$ and $c_{2,n}$, $0 \leq n < N_{\text{in}}$, produces the
following entangled triplet via the superposition operations:

$$ \begin{aligned} \epsilon_{0,n} &= \mathcal{S}_l\{c_{2,n}\} + c_{0,n} \\ \epsilon_{1,n} &= \mathcal{S}_l\{c_{0,n}\} + c_{1,n} \\ \epsilon_{2,n} &= \mathcal{S}_l\{c_{1,n}\} + c_{2,n} \end{aligned} \tag{6} $$

where:

$$ \mathcal{S}_l\{c\} \equiv \begin{cases} (c \ll l), & \text{if } l \geq 0 \\ \left[ c \gg (-l) \right], & \text{if } l < 0 \end{cases} \tag{7} $$

is the left or right arithmetic shift of $c$ by $l$ bits. If we
assume that the utilized integer representation comprises w bits, the
l-bit left-shift operations of (6) must be upper-bounded by w to
avoid overflow. Therefore, if the dynamic range of the input streams
$\mathbf{c}_0$, $\mathbf{c}_1$, $\mathbf{c}_2$ is l + k bits:

$$ 2l + k \leq w \tag{8} $$

in order to ensure no overflow happens from the arithmetic shifts of
(6). The values for l and k are chosen such that l + k is maximum
within the constraint of (8) and k ≤ l. Via the application of LSB
operations, each entangled input stream $\boldsymbol{\epsilon}_m$
($0 \leq m < M$) is converted to the entangled output stream^3
$\boldsymbol{\delta}_m$ (which contains $N_{\text{out}}$ values):

$$ \forall m: \; \boldsymbol{\delta}_m = \left( \boldsymbol{\epsilon}_m \;\text{op}\; \mathbf{g} \right). \tag{9} $$

A conceptual illustration of the entangled outputs after (6) and (9)
is given in Fig. 1. Our description until this point indicates a key
aspect: l bits of dynamic range are used within each entangled
input/output in order to achieve recovery from one fail-stop failure
occurring in the computation of $\delta_{0,n}$, $\delta_{1,n}$ or
$\delta_{2,n}$. As a practical instantiation of (6), we can set
w = 32, l = 11 and k = 10 in a signed 32-bit integer configuration.

^3 For the particular cases of op ∈ {+, -}, $\mathbf{g}$ must also
be entangled with itself via $g_n \leftarrow \mathcal{S}_l\{g_n\} + g_n$,
in order to retain the homomorphism of the performed operation.

Figure 1. Illustration of three entangled outputs after integer LSB
processing. The solid arrows indicate the maximum attainable dynamic
range of each output $d_{0,n}$, $d_{1,n}$ and $d_{2,n}$. The dotted
rectangles and arrows illustrate that the contents of each entangled
output are contained within the two other entangled outputs.

We now describe the disentanglement and recovery process. The reader
can also consult Fig. 1.

2) Disentanglement and Recovery: We can disentangle the outputs by
($0 \leq n < N_{\text{out}}$):

$$ \begin{aligned} d_{\text{temp}} &= \delta_{2,n} - \mathcal{S}_l\{\delta_{1,n}\} \\ \hat{d}_{2,n} &= \mathcal{S}_{-2(w-l)}\left\{ \mathcal{S}_{2(w-l)}\{d_{\text{temp}}\} \right\} \\ \hat{d}_{0,n} &= -\mathcal{S}_{-2l}\left\{ \left( d_{\text{temp}} - \hat{d}_{2,n} \right) \right\} \\ \hat{d}_{1,n} &= \delta_{1,n} - \mathcal{S}_l\{\hat{d}_{0,n}\} \end{aligned} \tag{10} $$

The first three parts of (10) assume a 2w-bit integer representation
is used for the interim operations, as the temporary variable
$d_{\text{temp}}$ is stored in 2w-bit integer representation.
However, all recovered outputs, $\hat{d}_{0,n}$, $\hat{d}_{1,n}$ and
$\hat{d}_{2,n}$, require only w bits.

Explanation of (10) (see also Fig. 1): The first part creates a
composite number comprising $d_{0,n}$ in the l + k most-significant
bits and $d_{2,n}$ in the 2l least-significant bits (therefore,
$d_{\text{temp}}$ requires 3l + k bits). In the second part,
$\hat{d}_{2,n}$ is extracted by: (i) discarding the (2w - 2l)
most-significant bits; (ii) arithmetically shifting the output down
to the correct range. The third part of (10) uses $\hat{d}_{2,n}$ to
recover $\hat{d}_{0,n}$ and, in the fourth part of (10),
$\hat{d}_{0,n}$ is used to recover $\hat{d}_{1,n}$.

Remark 1 (operations within w bits): To facilitate our exposition,
the first three parts of (10) are presented under the assumption of a
2w-bit integer representation. However, it is straightforward to
implement them via w-bit integer operations by separating
$d_{\text{temp}}$ into two parts of w bits and performing the
operations separately within these parts.

Remark 2 (recovery without the use of $\delta_{0,n}$): Notice that
(10) does not use $\delta_{0,n}$. This is a crucial element of our
approach: since $\hat{d}_{0,n}$, $\hat{d}_{1,n}$ and $\hat{d}_{2,n}$
were derived without using $\delta_{0,n}$, full recovery of all
outputs takes place even with the loss of one entangled stream. We
are able to do this because, for every n,
$0 \leq n < N_{\text{out}}$, $\delta_{1,n}$ and $\delta_{2,n}$
contain $d_{0,n}$ and $d_{2,n}$, which suffice to recreate
$\delta_{0,n}$ if the latter is not available due to a fail-stop
failure. This link is pictorially illustrated in Fig. 1. Since the
entangled pattern is cyclically symmetric, it is straightforward to
demonstrate that recovery from loss of any single one of the three
output streams is possible following the same approach.
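
The M = 3 procedure can be traced end to end with a short Python sketch. It follows the shift-and-add entanglement of (6), applies circular convolution as the LSB operation, discards the output of core 0, and recovers all three results from the two surviving entangled outputs as in (10). This is a minimal sketch under stated assumptions: the function names and test data are hypothetical, Python's unbounded integers stand in for the 2w-bit register (the shift pair of the second part of (10) is emulated by masking), and it is not the AVX2 realization used in Section V.

```python
from typing import List

L_BITS, K_BITS, W_BITS = 11, 10, 32   # l, k, w with 2l + k = w, as in (8)


def shift(c: int, l: int) -> int:
    """S_l{c} of (7): left shift for l >= 0, arithmetic right shift otherwise."""
    return c << l if l >= 0 else c >> (-l)


def keep_low_bits_signed(x: int, bits: int) -> int:
    """Signed value held in the `bits` least-significant bits of x.

    Mirrors S_{-(2w-bits)}{S_{2w-bits}{x}} on a 2w-bit register.
    """
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x


def circular_convolve(stream: List[int], kernel: List[int]) -> List[int]:
    n = len(stream)
    return [sum(g * stream[(i - j) % n] for j, g in enumerate(kernel))
            for i in range(n)]


def entangle3(c0, c1, c2):
    """Eq. (6), applied sample by sample."""
    e0 = [shift(b, L_BITS) + a for a, b in zip(c0, c2)]
    e1 = [shift(b, L_BITS) + a for a, b in zip(c1, c0)]
    e2 = [shift(b, L_BITS) + a for a, b in zip(c2, c1)]
    return e0, e1, e2


def disentangle3(delta1, delta2):
    """Eq. (10): all three outputs from delta_1 and delta_2 only."""
    d0, d1, d2 = [], [], []
    for x1, x2 in zip(delta1, delta2):
        d_temp = x2 - shift(x1, L_BITS)          # equals d2 - 2^(2l) * d0
        d2_n = keep_low_bits_signed(d_temp, 2 * L_BITS)
        d0_n = -shift(d_temp - d2_n, -2 * L_BITS)
        d1_n = x1 - shift(d0_n, L_BITS)
        d0.append(d0_n)
        d1.append(d1_n)
        d2.append(d2_n)
    return d0, d1, d2


# Hypothetical inputs and kernel, small enough to respect the range of (11).
c0, c1, c2 = [100, -200, 300, 7], [-50, 60, -70, 80], [9, -8, 7, -6]
g = [3, -2, 1]

direct = tuple(circular_convolve(c, g) for c in (c0, c1, c2))
limit = (1 << (L_BITS + K_BITS - 1)) - (1 << L_BITS)
in_range = all(abs(v) <= limit for d in direct for v in d)

delta = [circular_convolve(e, g) for e in entangle3(c0, c1, c2)]
fits_w = all(abs(v) < (1 << (W_BITS - 1)) for s in delta for v in s)

delta[0] = None                      # emulate a fail-stop failure of core 0
recovered = disentangle3(delta[1], delta[2])
```

Here `recovered` matches `direct` even though `delta[0]` was never read. By the cyclic symmetry noted in Remark 2, a failure of core 1 or core 2 is handled by relabeling the streams.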

Remark 3 (dynamic range): Bit l + k within each recovered output
$\hat{d}_{0,n}$, $\hat{d}_{1,n}$ and $\hat{d}_{2,n}$ represents its
sign bit. Given that: (i) each entangled output comprises the
addition of two outputs (with one of them left-shifted by l bits);
(ii) the entangled outputs must not exceed 2l + k bits, we deduce
that the outputs of the LSB operations must not exceed the range

$$ \forall n: \; d_{0,n}, d_{1,n}, d_{2,n} \in \left\{ -\left( 2^{l+k-1} - 2^{l} \right), \; 2^{l+k-1} - 2^{l} \right\}. \tag{11} $$

Therefore, (11) comprises the range permissible for the LSB
operations of (9) with the entangled representation of (6). Thus, we
conclude that, for integer outputs produced by the LSB operations of
(9) with range bounded by (11), the extraction mechanism of (10) is
necessary and sufficient for the recovery of any single stream in
$\hat{d}_{0,n}$, $\hat{d}_{1,n}$, $\hat{d}_{2,n}$ for all stream
positions n, $0 \leq n < N_{\text{out}}$.


B. Generalized Entanglement in Groups of M Inputs (M ≥ 3)

We extend the proposed entanglement process to using M inputs and
providing M entangled descriptions, each comprising the linear
superposition of two inputs. This ensures that, for every n
($0 \leq n < N_{\text{out}}$), any single failure will be
recoverable within each group of M output samples. The condition for
ensuring that overflow is avoided is

$$ (M-1)l + k \leq w \tag{12} $$

and the dynamic range supported for all outputs is ($\forall m, n$):

$$ d_{m,n} \in \left\{ -2^{(M-3)l+k}\left( 2^{l-1} - 1 \right), \; 2^{(M-3)l+k}\left( 2^{l-1} - 1 \right) \right\}. \tag{13} $$

We now define the following operator that generalizes the proposed
numerical entanglement process:

$$ \mathbf{S} = \begin{bmatrix} 1 & 0 & \cdots & 0 & \mathcal{S}_l \\ \mathcal{S}_l & 1 & \cdots & 0 & 0 \\ \vdots & \ddots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & \mathcal{S}_l & 1 \end{bmatrix} \tag{14} $$

with $\mathbf{S}$ the circulant matrix operator comprising cyclic
permutations of the 1 × M vector
$[1 \; 0 \; \cdots \; 0 \; \mathcal{S}_l]$.

As before, in the generalized entanglement in groups of M streams,
the values for l and k are chosen such that l + k is maximum within
the constraint of (12) and k ≤ l. Moreover, the exact same principle
applies, i.e., pairs of inputs are entangled together (with one of
the two shifted by l bits) to create each entangled input stream of
data. Any LSB operation is then performed directly on these input
streams and any single fail-stop failure will be recoverable within
each group of M outputs. For every input stream position n,
$0 \leq n < N_{\text{in}}$, the entanglement vector performing the
linear superposition of pairs out of M inputs is now formed by:

$$ \left[ \epsilon_{0,n} \;\cdots\; \epsilon_{M-1,n} \right]^{T} = \mathbf{S} \left[ c_{0,n} \;\cdots\; c_{M-1,n} \right]^{T}. \tag{15} $$

After the application of (9), we can disentangle every output stream
element $\delta_{m,n}$, $0 \leq n < N_{\text{out}}$, as follows. We
first identify the unavailable entangled output stream
$\boldsymbol{\delta}_r$ (with $0 \leq r < M$) due to the single core
failure. Then, we produce the 2w-bit temporary variable
$d_{\text{temp}}$ by:

$$ d_{\text{temp}} = \sum_{m=0}^{M-2} (-1)^{m} \, \mathcal{S}_{(M-2-m)l}\left\{ \delta_{(r+1+m) \bmod M,\, n} \right\}. \tag{16} $$

Notice that (16) does not use $\boldsymbol{\delta}_r$. Substituting
$\delta_{j,n} = \mathcal{S}_l\{d_{(j-1) \bmod M,\, n}\} + d_{j,n}$
into (16), all intermediate terms cancel and
$d_{\text{temp}} = \mathcal{S}_{(M-1)l}\{d_{r,n}\} + (-1)^{M} d_{(r+M-1) \bmod M,\, n}$.
We can then extract the value of $\hat{d}_{r,n}$ and
$\hat{d}_{(r+M-1) \bmod M,\, n}$ directly from $d_{\text{temp}}$:

$$ \hat{d}_{(r+M-1) \bmod M,\, n} = (-1)^{M} \, \mathcal{S}_{-[2w-(M-1)l]}\left\{ \mathcal{S}_{2w-(M-1)l}\{d_{\text{temp}}\} \right\} \tag{17} $$

$$ \hat{d}_{r,n} = \mathcal{S}_{-(M-1)l}\left\{ d_{\text{temp}} - (-1)^{M} \hat{d}_{(r+M-1) \bmod M,\, n} \right\}. \tag{18} $$

The other outputs can now be disentangled by ($1 \leq m \leq M-2$):

$$ \forall m: \; \hat{d}_{(r+m) \bmod M,\, n} = \delta_{(r+m) \bmod M,\, n} - \mathcal{S}_l\left\{ \hat{d}_{(r+m-1) \bmod M,\, n} \right\}. \tag{19} $$

Given that for every output position n we are able to recover all
results of all M streams without using $\delta_{r,n}$ in (16)-(19),
the proposed method is able to recover from a single fail-stop
failure in one of the M entangled streams.
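
The generalized procedure can be checked numerically with the following Python sketch, which entangles M streams in the circulant pattern described above, applies an LSB operation to each entangled stream, and then recovers every output for each possible failed stream r. Several points are assumptions made for illustration: the function names, the test data and the composite operator (scaling followed by a fixed reversal) are hypothetical; the alternating sign (-1)^M for the low part of the temporary variable is obtained by expanding the alternating shifted sum by hand; and Python's unbounded integers replace the 2w-bit register.

```python
from typing import List, Optional


def keep_low_bits_signed(x: int, bits: int) -> int:
    """Signed value held in the `bits` least-significant bits of x."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x


def entangle(c: List[List[int]], l: int) -> List[List[int]]:
    """eps_m = S_l{c_((m-1) mod M)} + c_m at every stream position."""
    M = len(c)
    return [[(prev << l) + cur for prev, cur in zip(c[(m - 1) % M], c[m])]
            for m in range(M)]


def recover(delta: List[Optional[List[int]]], r: int, l: int) -> List[List[int]]:
    """Recover all M outputs without reading delta[r] (the failed stream)."""
    M = len(delta)
    n_out = len(delta[(r + 1) % M])
    sign = -1 if M % 2 else 1                      # (-1)^M
    d = [[0] * n_out for _ in range(M)]
    for n in range(n_out):
        # Alternating shifted sum over the M-1 surviving entangled outputs.
        d_temp = sum(
            (-1) ** m * (delta[(r + 1 + m) % M][n] << ((M - 2 - m) * l))
            for m in range(M - 1))
        # Low (M-1)l bits hold (-1)^M * d_(r-1); the rest holds d_r.
        last = sign * keep_low_bits_signed(d_temp, (M - 1) * l)
        d[(r - 1) % M][n] = last
        d[r][n] = (d_temp - sign * last) >> ((M - 1) * l)
        # Peel off the remaining outputs one at a time.
        for m in range(1, M - 1):
            d[(r + m) % M][n] = (delta[(r + m) % M][n]
                                 - (d[(r + m - 1) % M][n] << l))
    return d


def lsb_op(stream: List[int]) -> List[int]:
    """Hypothetical LSB operation: scale by 3, then reverse the order."""
    return [3 * x for x in reversed(stream)]


# M = 5 streams with l = 7 (the M = 5 entry of Table I uses l = 7, k = 4).
c = [[10, -20, 30], [4, 5, -6], [-7, 8, 9], [100, -100, 0], [1, 2, 3]]
direct = [lsb_op(s) for s in c]
delta_full = [lsb_op(e) for e in entangle(c, l=7)]

all_recovered = []
for r in range(len(c)):
    delta = list(delta_full)
    delta[r] = None                                # fail-stop failure of core r
    all_recovered.append(recover(delta, r, l=7))
```

Every entry of `all_recovered` equals `direct`, whichever stream is dropped. The per-sample cost is a fixed number of shifts and additions that grows with M only, independent of `lsb_op`.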

Remark 4 (dynamic range of generalized entanglement and equivalence
to checksum methods): Examples for the maximum bitwidth achievable
for different cases of M are given in Table I, assuming a 32-bit
representation. We also present the dynamic range permitted by the
equivalent checksum-based method of (3)-(5) in order to ensure that
its checksum stream does not overflow under a 32-bit representation.
Evidently, for M ≤ 10, the proposed approach incurs a loss of 1 to 9
bits of dynamic range against the checksum-based method, while it
allows for higher dynamic range than the checksum-based method for
M ≥ 11. At the same time, our proposal does not require the overhead
of applying the LSB operations to an additional stream, as it
“overlays” the information of each input onto another input via the
numerical entanglement of pairs of inputs. Beyond this important
difference, our approach offers the exact equivalent to the checksum
methods of (3)-(5) for integer inputs. Therefore, equivalently to
checksum methods, beyond recovery from single fail-stop failures, our
proposal can also be used for the detection of silent data
corruptions (SDCs) in any input stream, as long as such SDCs do not
occur in coinciding output stream positions. We plan to explore this
aspect in future work.

Table I
Examples of l and k values and bitwidth supported for the output
data under w = 32 bits and: (i) different numbers of entanglements;
(ii) checksum-based method of (3)-(5). Any failure in 1 out of M
streams is guaranteed to be recoverable under both frameworks.

                    Maximum bitwidth supported by
  M     l     k     Proposed:       Checksum-based:
                    (M-2)l + k      w - ⌈log2 M⌉
  3    11    10         21               30
  4     8     8         24               30
  5     7     4         25               29
  8     4     4         28               29
 11     3     2         29               28
 16     2     2         30               28
 32     1     1         31               27
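
The entries of Table I can be regenerated with a brute-force search. The Python sketch below assumes that, for each M, the pair (l, k) is chosen to maximize the supported output bitwidth (M-2)l + k subject to (M-1)l + k ≤ w and 1 ≤ k ≤ l; this objective is our reading of the table rather than a rule stated explicitly in the text, and the function names are hypothetical.

```python
from typing import Dict, Tuple


def best_parameters(M: int, w: int = 32) -> Tuple[int, int, int]:
    """Return (l, k, bitwidth) maximizing (M-2)l + k under the constraints."""
    best = None
    for l in range(1, w):
        for k in range(1, l + 1):                 # k <= l
            if (M - 1) * l + k <= w:              # no overflow in w bits
                bits = (M - 2) * l + k
                if best is None or bits > best[2]:
                    best = (l, k, bits)
    return best


def checksum_bits(M: int, w: int = 32) -> int:
    """w - ceil(log2 M): headroom needed so the sum of M inputs fits."""
    return w - (M - 1).bit_length()


table: Dict[int, Tuple[int, int, int, int]] = {}
for M in (3, 4, 5, 8, 11, 16, 32):
    l, k, bits = best_parameters(M)
    table[M] = (l, k, bits, checksum_bits(M))
```

With w = 32 this reproduces the seven rows of the table, e.g. `table[3]` is `(11, 10, 21, 30)` and `table[11]` is `(3, 2, 29, 28)`, the first row in which the proposed approach supports a wider output than the checksum-based method.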


IV. Complexity in LSB Operations with 
Numerical Entanglement 

Consider M input integer data streams, each comprising several
samples, and consider that an LSB operation op with kernel
$\mathbf{g}$ is applied on each stream. The operations count
(additions/multiplications) for stream-by-stream sum-of-products
between a matrix comprising M subblocks of N × N integers and a
matrix kernel comprising N × N integers is:
$C_{\text{GEMM}} = MN^{3}$. For sesquilinear operations like
convolution and cross-correlation of M input integer data streams
(each comprising N samples) with kernel $\mathbf{g}$ [see Fig. 1(a)],
depending on the utilized realization, the number of operations can
range from $O(MN^{2})$ for direct algorithms (e.g., time-domain
convolution) to $O(MN \log_2 N)$ for fast algorithms (e.g.,
FFT-based convolution). For example, for convolution or
cross-correlation under these settings and an overlap-save
realization for consecutive block processing, the number of
operations (additions/multiplications) is $C_{\text{conv,time}}$,
which grows as $O(MN^{2})$, for time-domain processing and
$C_{\text{conv,freq}} = M\left[ (45N + 15) \log_2 (3N + 1) + 3N + 1 \right]$
for frequency-domain processing.

As described in Section III, numerical entanglement of M input
integer data streams (of N samples each) requires $O(MN)$ operations
for the entanglement, extraction and recovery per output sample. For
example, ignoring all arithmetic-shifting operations (which take a
negligible amount of time), based on the description of Section III,
the upper bound of the operations for numerical entanglement,
extraction and recovery is: $C_{\text{ne,conv}} = 2MN$. Similarly as
before, for the special case of the GEMM operation using M subblocks
of N × N integers, the upper bound of the overhead of numerical
entanglement of all inputs is: $C_{\text{ne,GEMM}} = 2MN^{2}$. For
all values for N and M of practical relevance (e.g.,
100 ≤ N ≤ 1000 and 3 ≤ M ≤ 32) and sesquilinear operations like
matrix products, convolution and cross-correlation, it can easily be
calculated from the ratios $C_{\text{ne,GEMM}} / C_{\text{GEMM}}$,
$C_{\text{ne,conv}} / C_{\text{conv,time}}$ and
$C_{\text{ne,conv}} / C_{\text{conv,freq}}$ that the relative
overhead of numerical entanglement, extraction and recovery in terms
of arithmetic operations is below 0.3%. Most importantly,

$$ \lim_{N \to \infty} \frac{C_{\text{ne,GEMM}}}{C_{\text{GEMM}}} = \lim_{N \to \infty} \frac{C_{\text{ne,conv}}}{C_{\text{conv,time}}} = \lim_{N \to \infty} \frac{C_{\text{ne,conv}}}{C_{\text{conv,freq}}} = 0, \tag{20} $$

i.e., the relative overhead of the proposed approach approaches 0%
as the dimension of the LSB processing increases.

On the other hand, the overhead of checksum-based methods in terms
of operations count (additions/multiplications) for each case is
represented by
$C_{\text{cs,GEMM}} = 2MN^{2} + \frac{1}{M} C_{\text{GEMM}}$,
$C_{\text{cs,conv,time}} = 2MN + \frac{1}{M} C_{\text{conv,time}}$ and
$C_{\text{cs,conv,freq}} = 2MN + \frac{1}{M} C_{\text{conv,freq}}$.
As expected, the relative overhead of checksum methods converges to
$\frac{1}{M} \times 100\%$ as the dimension of the LSB processing
operations increases, i.e.,

$$ \lim_{N \to \infty} \frac{C_{\text{cs,GEMM}}}{C_{\text{GEMM}}} = \lim_{N \to \infty} \frac{C_{\text{cs,conv,time}}}{C_{\text{conv,time}}} \tag{21} $$

$$ = \lim_{N \to \infty} \frac{C_{\text{cs,conv,freq}}}{C_{\text{conv,freq}}} = \frac{1}{M}. \tag{22} $$

Therefore, the checksum-based method for fail-stop mitigation leads
to substantial overhead (above 10%) when high reliability is
pursued, i.e., when M ≤ 8. Finally, even for the low reliability
regime (i.e., when M > 8), checksum-based methods will incur more
than 4% overhead in terms of arithmetic operations.
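
As a rough numerical illustration of this convergence, the Python sketch below evaluates the relative checksum overhead for frequency-domain convolution. It assumes the operation counts take the form given in this section, i.e., a checksum-generation term of 2MN operations plus the processing of one extra stream out of M; the function names are hypothetical and the printed counts are not measurements.

```python
import math


def conv_freq_ops(M: int, N: int) -> float:
    """Operation count of frequency-domain convolution for M streams."""
    return M * ((45 * N + 15) * math.log2(3 * N + 1) + 3 * N + 1)


def checksum_relative_overhead(M: int, N: int) -> float:
    """(2MN + C_conv,freq / M) / C_conv,freq under the stated assumptions."""
    base = conv_freq_ops(M, N)
    return (2 * M * N + base / M) / base


def checksum_overhead_limit(M: int) -> float:
    """Asymptotic relative overhead: one extra stream out of M."""
    return 1.0 / M


overhead_m8 = {N: checksum_relative_overhead(8, N) for N in (100, 1000)}
limit_m3 = checksum_overhead_limit(3)
limit_m8 = checksum_overhead_limit(8)
```

For M = 8 the finite-N values sit slightly above the limit of 12.5% and decrease toward it as N grows, while for M = 3 the limit is about 33.3%; these are the two values of M used in the experiments of Section V.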


V. Experimental Validation 

All our results were obtained using an Intel Core i7-4700MQ 2.40GHz
processor (Haswell architecture with AVX2 support, Windows 8 64-bit
system, Microsoft Visual Studio 2013 compiler). Entanglement,
disentanglement and fail-stop recovery mechanisms were realized using
the Intel AVX2 SIMD instruction set for faster processing. For all
cases, we also present comparisons with checksum-based recovery, the
checksum elements of which were also generated using AVX2 SIMD
instructions.

We consider the case of convolution operations of integer streams.
We used Intel's Integrated Performance Primitives (IPP) 7.0 [26]
convolution routine ippsConv_64f that can handle the dynamic range
required under convolutions with 32-bit integer inputs. We
experimented with an input size of $N_{\text{in}} = 10^{6}$ samples
and several kernel sizes, $N_{\text{kernel}} \in [100, 4500]$
samples. Representative results are given in Fig. 2 under two
settings for the number of input streams, M, and without the
occurrence of failures, i.e., when operating under normal
conditions^4. The results demonstrate that the proposed approach
incurs substantially smaller overhead for a single fail-stop
mitigation in comparison to the checksum-based method. Specifically,
the decrease in throughput for the proposed approach in comparison
to the failure-intolerant case is only 1.8% to 2.8%, while the
checksum-based method incurs 16.1% to 37.8% throughput loss for the
same test. As expected by the theoretical calculations of Section
IV, this is an order-of-magnitude higher than the overhead of
numerical entanglement.

Figure 2. Throughput results for convolution of M integer streams.
“Conventional” refers to conventional (failure-intolerant)
convolution realization using the state-of-the-art Intel IPP 7.0
library and it is used as a benchmark under: (a) M = 3; (b) M = 8.

VI. Conclusions 

We propose a new approach to fail-stop failure recovery in linear,
sesquilinear and bijective (LSB) processing of integer data streams
that is based on the novel concept of numerical entanglement. Under
M input streams (M ≥ 3), the proposed approach provides for:
(i) guaranteed recovery from any single fail-stop failure;
(ii) complexity overhead that depends only on M and not on the
complexity of the performed LSB operations, thus quickly becoming
negligible as the complexity of the LSB operations increases. These
two features demonstrate that the proposed solution forms a third
family of recovery from fail-stop failures (i.e., beyond the
well-known and widely-used checksum-based methods and modular
redundancy) and offers unique advantages. As such, it is envisaged
that it will find usage in a multitude of systems that require
enhanced reliability against core failures in hardware with very low
implementation overhead.

^4 Under the occurrence of one fail-stop failure, the performance of
the proposed approach remains the same, as the results are
disentangled as soon as (any) M - 1 output streams become available.
On the other hand, the performance of the checksum-based approach
will decrease slightly under a fail-stop failure, since results will
need to be recovered from the checksum stream.